Fault Tolerant Clustering Revisited
In discrete k-center and k-median clustering, we are given a set of points P
in a metric space M, and the task is to output a set C \subseteq P, |C| = k,
such that the cost of clustering P using C is as small as possible. For
k-center, the cost is the furthest a point has to travel to its nearest center,
whereas for k-median, the cost is the sum of all point-to-nearest-center
distances. In the fault-tolerant versions of these problems, we are given an
additional parameter 1 \leq \ell \leq k, such that when computing the cost
of clustering, points are assigned to their \ell-th nearest-neighbor in C,
instead of their nearest neighbor. We provide constant factor approximation
algorithms for these problems that are both conceptually simple and highly
practical from an implementation standpoint.
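The two fault-tolerant cost functions described above can be illustrated with a minimal sketch. It assumes points in the Euclidean plane (the abstract allows any metric space), and the function name `fault_tolerant_costs` is hypothetical. It only evaluates the cost of a given center set; it is not the approximation algorithm from the paper.

```python
import math

def fault_tolerant_costs(points, centers, ell):
    """Return (k-center cost, k-median cost) when every point is charged
    the distance to its ell-th nearest center. ell = 1 gives the ordinary
    (non-fault-tolerant) costs."""
    if not 1 <= ell <= len(centers):
        raise ValueError("need 1 <= ell <= k")
    charged = []
    for p in points:
        # Sort the distances from p to all centers and take the ell-th smallest.
        dists = sorted(math.dist(p, c) for c in centers)
        charged.append(dists[ell - 1])
    # k-center: furthest charged distance; k-median: sum of charged distances.
    return max(charged), sum(charged)

# Assumed toy input: four collinear points, with centers at the two ends.
points = [(0, 0), (1, 0), (4, 0), (5, 0)]
centers = [(0, 0), (5, 0)]
print(fault_tolerant_costs(points, centers, 1))
print(fault_tolerant_costs(points, centers, 2))
```

With \ell = 2, each point is charged its distance to the farther of the two centers, so both costs grow compared with \ell = 1.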
Avoiding the Global Sort: A Faster Contour Tree Algorithm
We revisit the classical problem of computing the contour tree of a
scalar field f: M \to R, where M is a
triangulated simplicial mesh in R^d. The contour tree is a
fundamental topological structure that tracks the evolution of level sets of
f and has numerous applications in data analysis and visualization.
All existing algorithms begin with a global sort of at least all critical
values of f, which can require (roughly) \Omega(n \log n) time. Existing
lower bounds show that there are pathological instances where this sort is
required. We present the first algorithm whose time complexity depends on the
contour tree structure, and avoids the global sort for non-pathological inputs.
If C denotes the set of critical points in M, the running time is
roughly O(\sum_{v \in C} \log \ell_v), where \ell_v is the depth of v in
the contour tree. This matches all existing upper bounds, but is a significant
improvement when the contour tree is short and fat. Specifically, our approach
ensures that any comparison made is between nodes in the same descending path
in the contour tree, allowing us to argue strong optimality properties of our
algorithm.
Our algorithm requires several novel ideas: partitioning M in
well-behaved portions, a local growing procedure to iteratively build contour
trees, and the use of heavy path decompositions for the time complexity
analysis.
On the Complexity of Randomly Weighted Voronoi Diagrams
In this paper, we provide an O(n polylog n) bound on the expected
complexity of the randomly weighted Voronoi diagram of a set of n sites in
the plane, where the sites can be either points, interior-disjoint convex sets,
or other more general objects. Here the randomness is on the weight of the
sites, not their location. This compares favorably with the worst case
complexity of these diagrams, which is quadratic. As a consequence we get an
alternative proof to that of Agarwal et al. [AHKS13] of the near linear
complexity of the union of randomly expanded disjoint segments or convex sets
(with an improved bound on the latter). The technique we develop is elegant and
should be applicable to other problems.
Fast Clustering with Lower Bounds: No Customer too Far, No Shop too Small
We study the Lower Bounded Center (LBC) problem, which is a clustering
problem that can be viewed as a variant of the k-center problem. In the LBC
problem, we are given a set of points P in a metric space and a lower bound
\lambda, and the goal is to select a set C \subseteq P of centers and an
assignment that maps each point in P to a center of C such that each center of
C is assigned at least \lambda points. The price of an assignment is the
maximum distance between a point and the center it is assigned to, and the goal
is to find a set of centers and an assignment of minimum price. We give a
constant factor approximation algorithm for the LBC problem that runs in O(n
\log n) time when the input points lie in the d-dimensional Euclidean space
R^d, where d is a constant. We also prove that this problem cannot be
approximated within a factor of 1.8 - \epsilon unless P = NP, even if the input
points are points in the Euclidean plane R^2.
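The definition of a feasible assignment and its price can be made concrete with a short sketch. It assumes Euclidean points and represents an assignment as a list of center indices into P; the name `lbc_price` is hypothetical. It checks and evaluates a given solution only and does not implement the approximation algorithm.

```python
import math
from collections import Counter

def lbc_price(points, assignment, lam):
    """assignment[i] is the index (into points) of the center serving point i,
    so the centers are automatically a subset of the input points.
    Returns the maximum point-to-center distance, or None if some chosen
    center serves fewer than lam points (infeasible assignment)."""
    load = Counter(assignment)
    if any(count < lam for count in load.values()):
        return None
    return max(math.dist(points[i], points[c]) for i, c in enumerate(assignment))

# Assumed toy input: a group of three points and a group of two points.
points = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0)]
assignment = [1, 1, 1, 3, 3]  # centers are points[1] and points[3]
print(lbc_price(points, assignment, lam=2))
print(lbc_price(points, assignment, lam=3))
```

Raising the lower bound from 2 to 3 makes this assignment infeasible, since the center at `points[3]` serves only two points.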
In pursuit of linear complexity in discrete and computational geometry
Many computational problems arise naturally from geometric data. In this thesis, we consider three such problems: (i) distance optimization problems over point sets, (ii) computing contour trees over simplicial meshes, and (iii) bounding the expected complexity of weighted Voronoi diagrams. While these topics are broad, here the focus is on identifying structure which implies linear (or near linear) algorithmic and descriptive complexity.
The first topic we consider is in geometric optimization. More specifically, we define a large class of distance problems, for which we provide linear time exact or approximate solutions. Roughly speaking, the class of problems facilitates either clustering together close points (i.e., netting) or throwing out outliers (i.e., pruning), allowing for successively smaller summaries of the relevant information in the input. A surprising number of classical geometric optimization problems are unified under this framework, including finding the optimal k-center clustering, the kth ranked distance, the kth heaviest edge of the MST, the minimum radius ball enclosing k points, and many others. In several cases we get the first known linear time approximation algorithm for a given problem, where our approximation ratio matches that of previous work.
The second topic we investigate is contour trees, a fundamental structure in computational topology. Contour trees give a compact summary of the evolution of level sets on a mesh, and are typically used on massive data sets. Previous algorithms for computing contour trees took Θ(n log n) time and were worst-case optimal. Here we provide an algorithm whose running time lies between Θ(nα(n)) and Θ(n log n), and varies depending on the shape of the tree, where α(n) is the inverse Ackermann function. In particular, this is the first algorithm with O(nα(n)) running time on instances with balanced contour trees. Our algorithmic results are complemented by lower bounds indicating that, up to a factor of α(n), on all instance types our algorithm performs optimally.
For the final topic, we consider the descriptive complexity of weighted Voronoi diagrams. Such diagrams have quadratic (or higher) worst-case complexity; however, as was the case for contour trees, here we push beyond worst-case analysis. A new diagram, called the candidate diagram, is introduced, which allows us to bound the complexity of weighted Voronoi diagrams arising from a particular probabilistic input model. Specifically, we assume weights are randomly permuted among fixed Voronoi sites, an assumption which is weaker than the more typical sampled locations assumption. Under this assumption, the expected complexity is shown to be near linear.
From Proximity to Utility: A Voronoi Partition of Pareto Optima
We present an extension of Voronoi diagrams in which, when deciding which site
a client is going to use, other site attributes (for example, prices or
weights) are considered in addition to the site distances. A cell in this
diagram is then the locus of all clients that consider the same set of sites to
be relevant. In particular, the precise site a client might use from this
candidate set depends on parameters that might change between usages, and the
candidate set lists all of the relevant sites. The resulting diagram is
significantly more expressive than Voronoi diagrams, but naturally has the
drawback that its complexity, even in the plane, might be quite high.
Nevertheless, we show that if the attributes of the sites are drawn from the
same distribution (note that the locations are fixed), then the expected
complexity of the candidate diagram is near linear.
To this end, we derive several new technical results, which are of
independent interest. In particular, we provide a high-probability,
asymptotically optimal bound on the number of Pareto optima points in a point
set uniformly sampled from the d-dimensional hypercube. To do so we revisit
the classical backward analysis technique, both simplifying and improving
relevant results in order to achieve the high-probability bounds.
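The quantity being bounded, the number of Pareto optima in a uniform sample from the hypercube, can be explored empirically with a brute-force sketch. It assumes the convention that a point is Pareto optimal when no other point is at least as large in every coordinate and strictly larger in at least one; the abstract does not fix the direction of dominance, and the names `pareto_optima`, `d`, and `n` are illustrative.

```python
import random

def pareto_optima(points):
    """Return the points not dominated by any other point (quadratic time)."""
    def dominates(q, p):
        # q dominates p: at least as large everywhere, strictly larger somewhere.
        return (all(a >= b for a, b in zip(q, p))
                and any(a > b for a, b in zip(q, p)))
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Assumed experiment: n points drawn uniformly from the unit cube [0, 1]^d.
rng = random.Random(0)
d, n = 3, 500
sample = [tuple(rng.random() for _ in range(d)) for _ in range(n)]
front = pareto_optima(sample)
print(len(front), "Pareto optima among", n, "points in dimension", d)
```

Repeating the experiment for growing n gives a rough feel for how slowly the number of Pareto optima grows relative to the sample size.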